{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:22.850039Z",
     "start_time": "2020-09-10T07:04:22.428209Z"
    }
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using backend: pytorch\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "import dgl\n",
    "import pickle\n",
    "import numpy as np\n",
    "import scipy.sparse as sp\n",
    "import torch as th\n",
    "import torch.nn.functional as F\n",
    "\n",
    "import utils\n",
    "import models\n",
    "import data_loader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:22.876715Z",
     "start_time": "2020-09-10T07:04:22.851294Z"
    }
   },
   "outputs": [],
   "source": [
    "os.environ[\"CUDA_VISIBLE_DEVICES\"] = '0'\n",
    "dev = th.device('cuda' if th.cuda.is_available() else 'cpu')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Configuration"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:22.883977Z",
     "start_time": "2020-09-10T07:04:22.877871Z"
    }
   },
   "outputs": [],
   "source": [
    "data_dir = \"../data\"\n",
    "adj_path = os.path.join(data_dir, \"adj_matrix_formal_stage.pkl\")\n",
    "feat_path = os.path.join(data_dir, \"feature_formal_stage.npy\")\n",
    "label_path = os.path.join(data_dir, \"train_labels_formal_stage.npy\")\n",
    "model_dir = \"../saved_models\"\n",
    "adj_norm = True\n",
    "feat_norm = None\n",
    "feat_norm_func = utils.feat_norm(feat_norm)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Data Loading"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:24.396121Z",
     "start_time": "2020-09-10T07:04:22.884784Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Finished data loading.\n",
      "NumNodes: 659574\n",
      "NumEdges: 5757154\n",
      "NumFeats: 100\n",
      "NumClasses: 20\n",
      "NumTrainingSamples: 559574\n",
      "NumValidationSamples: 50000\n"
     ]
    }
   ],
   "source": [
    "dataset = data_loader.KddDataset(adj_path, feat_path, label_path)\n",
    "raw_adj = dataset.adj\n",
    "raw_features = dataset.features\n",
    "raw_labels = dataset.labels\n",
    "train_mask = dataset.train_mask\n",
    "val_mask = dataset.val_mask\n",
    "test_mask = dataset.test_mask\n",
    "size_raw, num_features = raw_features.shape \n",
    "test_size = np.sum(test_mask)\n",
    "size_reduced = size_raw - test_size\n",
    "num_class = raw_labels.max() + 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Attack Solution"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Our attack solution is based on **Adversarial Adjacent Matrix Generation** and **Enhanced Feature Gradiant Attack**. We first generate an attack matrix, then modify attack node features by optimizing a customized attack loss. Note that in previous reseach on graph adversarial attacks, many proposed attacks that modify features and connections at the same time. However, most of these research did experiments on toy datasets, not as big as this one (~100-1000x bigger). When the search space becomes very large, it would be difficult to find the optimal solution (even if it exists, computationally expensive). Hence, we choose to modify connections and features consecutively."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Adversarial Adjacent Matrix Generation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 1: Target Node Selection\n",
    "\n",
    "Since there are strong constraints on attackers ($\\leq500$ nodes, $\\leq100$ edges for each node), it's ineffective to directly connect attack nodes to all test nodes (50000 in total). One connection for one node is obviously not enough considering such a big graph. We should focus on those test nodes that are probably classified correctly by target models, while leaving the nodes that are already difficult to classify. Since labels of test nodes are hidden, we use several models to classify test nodes and find their common predictions. The idea is that if a node is classified to the same class by a variaty of models, it is probably due to its special topology or feature properties. It would be interesting if we can affect these properties. Thus, we select this kind of nodes as targets to attack."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:25.560097Z",
     "start_time": "2020-09-10T07:04:24.397112Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "GCN(\n",
       "  (layers): ModuleList(\n",
       "    (0): GraphConv(in=100, out=64, normalization=both, activation=<function relu at 0x7f5f304d1e50>)\n",
       "    (1): GraphConv(in=64, out=20, normalization=both, activation=None)\n",
       "  )\n",
       "  (dropout): Dropout(p=0, inplace=False)\n",
       ")"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# GCN\n",
    "model_1_name = \"gcn_64_1.pkl\"\n",
    "model_1 = models.GCN(num_features, 64, num_class, 1, activation=F.relu, dropout=0)\n",
    "model_1_states = th.load(os.path.join(model_dir, model_1_name), map_location=dev)\n",
    "model_1.load_state_dict(model_1_states)\n",
    "model_1 = model_1.to(dev)\n",
    "model_1.eval()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:25.565658Z",
     "start_time": "2020-09-10T07:04:25.561019Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "TAGCN(\n",
       "  (layers): ModuleList(\n",
       "    (0): TAGConv(\n",
       "      (lin): Linear(in_features=300, out_features=128, bias=True)\n",
       "    )\n",
       "    (1): TAGConv(\n",
       "      (lin): Linear(in_features=384, out_features=20, bias=True)\n",
       "    )\n",
       "  )\n",
       "  (dropout): Dropout(p=0, inplace=False)\n",
       ")"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# TAGCN\n",
    "model_2_name = \"tagcn_128_1.pkl\"\n",
    "model_2 = models.TAGCN(num_features, 128, num_class, 1, activation=F.leaky_relu, dropout=0)\n",
    "model_2_states = th.load(os.path.join(model_dir, model_2_name), map_location=dev)\n",
    "model_2.load_state_dict(model_2_states)\n",
    "model_2 = model_2.to(dev)\n",
    "model_2.eval()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:27.624452Z",
     "start_time": "2020-09-10T07:04:25.567016Z"
    }
   },
   "outputs": [],
   "source": [
    "# Adj normalization\n",
    "if adj_norm:\n",
    "    adj = utils.adj_preprocess(raw_adj)\n",
    "else:\n",
    "    adj = raw_adj\n",
    "    \n",
    "graph = dgl.DGLGraph()\n",
    "graph.from_scipy_sparse_matrix(adj)\n",
    "features = th.FloatTensor(raw_features).to(dev)\n",
    "labels = th.LongTensor(raw_labels).to(dev)\n",
    "graph.ndata['features'] = features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:28.325079Z",
     "start_time": "2020-09-10T07:04:27.625671Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Acc on train: 0.6022\n",
      "Acc on val: 0.5968\n"
     ]
    }
   ],
   "source": [
    "pred_1 = model_1.forward(graph, features)\n",
    "print(\"Acc on train: {:.4f}\".format(utils.compute_acc(pred_1[:size_reduced], labels, train_mask)))\n",
    "print(\"Acc on val: {:.4f}\".format(utils.compute_acc(pred_1[:size_reduced], labels, val_mask)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:28.416483Z",
     "start_time": "2020-09-10T07:04:28.326071Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Acc on train: 0.6864\n",
      "Acc on val: 0.6626\n"
     ]
    }
   ],
   "source": [
    "pred_2 = model_2.forward(graph, features)\n",
    "print(\"Acc on train: {:.4f}\".format(utils.compute_acc(pred_2[:size_reduced], labels, train_mask)))\n",
    "print(\"Acc on val: {:.4f}\".format(utils.compute_acc(pred_2[:size_reduced], labels, val_mask)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:28.452330Z",
     "start_time": "2020-09-10T07:04:28.417374Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "% of common predictions: 0.7492\n"
     ]
    }
   ],
   "source": [
    "pred_1_np = pred_1.cpu().detach().numpy()\n",
    "pred_2_np = pred_2.cpu().detach().numpy()\n",
    "print(\"% of common predictions: {:.4f}\".format(np.sum(np.argmax(\n",
    "    pred_1_np[-test_size:], 1) == np.argmax(pred_2_np[-test_size:], 1)) / test_size))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:28.457757Z",
     "start_time": "2020-09-10T07:04:28.453139Z"
    }
   },
   "outputs": [],
   "source": [
    "target_node = np.where(\n",
    "    np.argmax(pred_1_np[-test_size:], 1) == np.argmax(pred_2_np[-test_size:], 1))[0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 2: Adversarial Connections\n",
    "\n",
    "After having selected target nodes, it's time to consider how to use limited number of connections to obtain the maximum influences. Here, we show three stratagies to present the key insight of our idea progressively, as shown in the figure. The first strategy, a basic one, all attack nodes are directly connected to target nodes. In this case, one attack node can influence in maximum 100  neighbour nodes. The second strategy, a better one, there are inter-connections between attack nodes. One attack node can now affect other attack nodes, thus more targets. The third strategy, push this further, some attack nodes are only connected to other attack nodes in a multi-layer fashion. Using this strategy, we can make the best use of limited connections to influence tagert nodes. \n",
    "\n",
    "As for how to choose the connection between specific attack node and target node, our answer is: choose it randomly. Indeed, we did a lot of work to find if there are some useful information related to the topological properties (e.g. degree, centrality, betweenness, etc.). However, we find the randomness is better than hand-crafted design. Besides, the attack performance is surprisingly stable. One possible reason is the isomorphy of graph. Initially all attack nodes are zeros, so there are no difference between them. After the connections are determined, their features are modified by the following attack algorithm. Hence, this process may result in some isomorphic graphs (or partially isomorphic) even considering the initialization is random."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:28.469639Z",
     "start_time": "2020-09-10T07:04:28.458479Z"
    }
   },
   "outputs": [],
   "source": [
    "def get_noise_list(adj, K, target_noise, noise_tmp_list):\n",
    "    i = 1\n",
    "    res = []\n",
    "    while len(res) < K and i < len(noise_tmp_list):\n",
    "        if adj[target_noise, noise_tmp_list[i]] == 0:\n",
    "            res.append(noise_tmp_list[i])\n",
    "        i += 1\n",
    "\n",
    "    return res\n",
    "\n",
    "\n",
    "def update_noise_active(noise_active, noise_edge, threshold=100):\n",
    "    for node in noise_active:\n",
    "        if noise_edge[node] >= threshold:\n",
    "            noise_active.pop(noise_active.index(node))\n",
    "    return noise_active\n",
    "\n",
    "\n",
    "def connect(test_node_list, max_connection, mode):\n",
    "    adj = np.zeros((500, 50500))\n",
    "    N = len(test_node_list)\n",
    "\n",
    "    if mode == 'random-inter':\n",
    "        # test_node_list: a list of test nodes to be connected\n",
    "        noise_edge = np.zeros(500)\n",
    "        noise_active = [i for i in range(500)]\n",
    "\n",
    "        # create edges between noise node and test node\n",
    "        for i in range(N):\n",
    "            if not(noise_active):\n",
    "                break\n",
    "            noise_list = np.random.choice(noise_active, 1)\n",
    "            noise_edge[noise_list] += 1\n",
    "            noise_active = update_noise_active(noise_active, noise_edge)\n",
    "            adj[noise_list, test_node_list[i]] = 1\n",
    "\n",
    "        # create edges between noise nodes\n",
    "        for i in range(len(noise_active)):\n",
    "            if not noise_active:\n",
    "                break\n",
    "            noise_tmp_list = sorted(noise_active, key=lambda x: noise_edge[x])\n",
    "            target_noise = noise_tmp_list[0]\n",
    "            K = 100 - noise_edge[target_noise]\n",
    "            noise_list = get_noise_list(adj, K, target_noise, noise_tmp_list)\n",
    "\n",
    "            noise_edge[noise_list] += 1\n",
    "            noise_edge[target_noise] += len(noise_list)\n",
    "\n",
    "            noise_active = update_noise_active(noise_active, noise_edge)\n",
    "            if noise_list:\n",
    "                adj[target_noise, 50000 + np.array(noise_list)] = 1\n",
    "                adj[noise_list, 50000 + target_noise] = 1\n",
    "\n",
    "    elif mode == 'multi-layer':\n",
    "        # test_node_list: a list of test nodes to be connected\n",
    "        noise_edge = np.zeros(500)\n",
    "        noise_active = [i for i in range(455)]\n",
    "\n",
    "        # create edges between noise node and test node\n",
    "        for i in range(N):\n",
    "            if not(noise_active):\n",
    "                break\n",
    "            noise_list = np.random.choice(noise_active, 1)\n",
    "            noise_edge[noise_list] += 1\n",
    "            noise_active = update_noise_active(\n",
    "                noise_active, noise_edge, threshold=90)\n",
    "            adj[noise_list, test_node_list[i]] = 1\n",
    "\n",
    "        # create edges between noise nodes\n",
    "        for i in range(len(noise_active)):\n",
    "            if not noise_active:\n",
    "                break\n",
    "            noise_tmp_list = sorted(noise_active, key=lambda x: noise_edge[x])\n",
    "            target_noise = noise_tmp_list[0]\n",
    "            K = 90 - noise_edge[target_noise]\n",
    "            noise_list = get_noise_list(adj, K, target_noise, noise_tmp_list)\n",
    "\n",
    "            noise_edge[noise_list] += 1\n",
    "            noise_edge[target_noise] += len(noise_list)\n",
    "\n",
    "            noise_active = update_noise_active(\n",
    "                noise_active, noise_edge, threshold=90)\n",
    "\n",
    "            if noise_list:\n",
    "                adj[target_noise, 50000 + np.array(noise_list)] = 1\n",
    "                adj[noise_list, 50000 + target_noise] = 1\n",
    "\n",
    "        noise_active_layer2 = [i for i in range(45)]\n",
    "        noise_edge_layer2 = np.zeros(45)\n",
    "        for i in range(455):\n",
    "            if not(noise_active_layer2):\n",
    "                break\n",
    "            noise_list = np.random.choice(noise_active_layer2, 10)\n",
    "            noise_edge_layer2[noise_list] += 1\n",
    "            noise_active_layer2 = update_noise_active(\n",
    "                noise_active_layer2, noise_edge_layer2, threshold=100)\n",
    "            adj[noise_list + 455, i + 50000] = 1\n",
    "            adj[i, noise_list + 50455] = 1\n",
    "\n",
    "    else:\n",
    "        print(\"Mode ERROR: 'mode' should be one of ['random-inter', 'multi-layer']\")\n",
    "\n",
    "    return adj"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:35.501086Z",
     "start_time": "2020-09-10T07:04:28.470337Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<500x50500 sparse matrix of type '<class 'numpy.float64'>'\n",
       "\twith 49238 stored elements in Compressed Sparse Row format>"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "adj_attack = connect(target_node, max_connection=90, mode='multi-layer')\n",
    "adj_attack = sp.csr_matrix(adj_attack)\n",
    "adj_attack"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:37.001793Z",
     "start_time": "2020-09-10T07:04:35.501996Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<500x660074 sparse matrix of type '<class 'numpy.float64'>'\n",
       "\twith 49238 stored elements in Compressed Sparse Row format>"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# conatnate to required size\n",
    "adj_adv = sp.hstack([sp.csr_matrix(np.zeros([500, size_raw - 50000])), adj_attack])\n",
    "adj_adv = sp.csr_matrix(adj_adv)\n",
    "adj_adv"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:37.006083Z",
     "start_time": "2020-09-10T07:04:37.002672Z"
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "No more than 100 edges for any attack node? True\n",
      "Symmetric attack matrix? True\n"
     ]
    }
   ],
   "source": [
    "# sanity check\n",
    "print(\"No more than 100 edges for any attack node?\",\n",
    "      adj_adv.getnnz(axis=1).max() <= 100)\n",
    "print(\"Symmetric attack matrix?\", bool(\n",
    "    ~(adj_adv[:, size_raw:].T != adj_adv[:, size_raw:]).sum()))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Enhanced Feature Gradient Attack\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 1: Initialization\n",
    "\n",
    "Adversarial adjcent matrix generation + zeros features + attack target selection \n",
    "\n",
    "Since the targeted attack performs better than the untargeted attack (easier to optimize), we also consider conducting the targetd attack, where the attack target class for a node is the least probable class."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:37.583493Z",
     "start_time": "2020-09-10T07:04:37.006908Z"
    }
   },
   "outputs": [],
   "source": [
    "# adjacent matrix \n",
    "raw_adj_adv = sp.vstack([raw_adj, adj_adv[:, :size_raw]])\n",
    "raw_adj_adv = sp.hstack([raw_adj_adv, adj_adv.T])\n",
    "if adj_norm:\n",
    "    raw_adj_adv = utils.adj_preprocess(raw_adj_adv)\n",
    "    \n",
    "# zeros features\n",
    "feat_adv = np.zeros((500, 100))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:37.586143Z",
     "start_time": "2020-09-10T07:04:37.584468Z"
    }
   },
   "outputs": [],
   "source": [
    "# target model configuration\n",
    "model_type = 'gcn'\n",
    "model_name = \"gcn_64_1.pkl\"\n",
    "num_hidden = 64\n",
    "num_layers = 1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:37.595534Z",
     "start_time": "2020-09-10T07:04:37.586948Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "GCN(\n",
       "  (layers): ModuleList(\n",
       "    (0): GraphConv(in=100, out=64, normalization=both, activation=<function leaky_relu at 0x7f5f304d2280>)\n",
       "    (1): GraphConv(in=64, out=20, normalization=both, activation=None)\n",
       "  )\n",
       "  (dropout): Dropout(p=0, inplace=False)\n",
       ")"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "if model_type == 'gcn':\n",
    "    model = models.GCN(num_features, num_hidden, num_class, num_layers,\n",
    "                       activation=F.leaky_relu, dropout=0)\n",
    "elif model_type == 'tagcn':\n",
    "    model = models.TAGCN(num_features, num_hidden, num_class, num_layers,\n",
    "                         activation=F.leaky_relu, dropout=0)\n",
    "    \n",
    "model_states = th.load(os.path.join(model_dir, model_name), map_location=dev)\n",
    "model.load_state_dict(model_states)\n",
    "model = model.to(dev)\n",
    "model.eval()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:39.409199Z",
     "start_time": "2020-09-10T07:04:37.596322Z"
    }
   },
   "outputs": [],
   "source": [
    "# prediction on raw graph (without attack nodes)\n",
    "raw_graph = dgl.DGLGraph()\n",
    "raw_graph.from_scipy_sparse_matrix(adj)\n",
    "features = th.FloatTensor(raw_features).to(dev)\n",
    "labels = th.LongTensor(raw_labels).to(dev)\n",
    "raw_graph.ndata['features'] = features\n",
    "pred_raw = model.forward(raw_graph, features)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:39.413069Z",
     "start_time": "2020-09-10T07:04:39.410237Z"
    }
   },
   "outputs": [],
   "source": [
    "# select the least probable class as the target class\n",
    "pred_raw_label = th.argmax(pred_raw[:size_raw][test_mask], 1)\n",
    "pred_test_prob = th.softmax(pred_raw[:size_raw][test_mask], 1)\n",
    "attack_label = th.argsort(pred_test_prob, 1)[:, 2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Step 2: Enhanced Gradient attack\n",
    "\n",
    "We design the following loss function for the targeted attack:\n",
    "$$L=f(x)_{c_0}-\\max f(x)_{c_{new}\\neq c_0} + k\\sum_{c\\in C}f(x)_{c_0}log(f(x)_c)$$\n",
    "where $f$ is the softmax output of the model, $c_0$ is original predicted class, $k$ is a constant determines the proportion of two parts of the combination of two losses (i.e. **Callini-Wagner** loss and **Cross Entropy** loss). At each iteration, we calculate the gradient of the attack loss w.r.t. features of attack nodes. Then we use the **Adadelta** to modify the features as learning parameters and to optimize the attack loss. **Attention:** Other optimization methods are also possible, while the magnitude of learning rate may vary a lot."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:40.951790Z",
     "start_time": "2020-09-10T07:04:39.413875Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "tensor([[0., 0., 0.,  ..., 0., 0., 0.],\n",
       "        [0., 0., 0.,  ..., 0., 0., 0.],\n",
       "        [0., 0., 0.,  ..., 0., 0., 0.],\n",
       "        ...,\n",
       "        [0., 0., 0.,  ..., 0., 0., 0.],\n",
       "        [0., 0., 0.,  ..., 0., 0., 0.],\n",
       "        [0., 0., 0.,  ..., 0., 0., 0.]], device='cuda:0', requires_grad=True)"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# graph construction\n",
    "graph_adv = dgl.DGLGraph()\n",
    "graph_adv.from_scipy_sparse_matrix(raw_adj_adv)\n",
    "features = th.FloatTensor(raw_features).to(dev)\n",
    "features_adv = th.FloatTensor(feat_adv).to(dev)\n",
    "features_adv.requires_grad_(True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:40.956116Z",
     "start_time": "2020-09-10T07:04:40.953673Z"
    }
   },
   "outputs": [],
   "source": [
    "# attack configuration\n",
    "lr = 1\n",
    "k = 1000\n",
    "epoch = 100\n",
    "feat_lim = 2.0\n",
    "optimizer = th.optim.Adadelta(\n",
    "    [features_adv], lr=lr, rho=0.9, eps=1e-06, weight_decay=0)\n",
    "\n",
    "# other possible optimizers\n",
    "# optimizer = th.optim.Adam([features_ae], lr=lr)\n",
    "# optimizer = th.optim.Adagrad([features_adv], lr=lr, lr_decay=0,\n",
    "#                              weight_decay=0, initial_accumulator_value=0, eps=1e-10)\n",
    "# optimizer = th.optim.SGD([features_adv], lr=lr)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:45.918097Z",
     "start_time": "2020-09-10T07:04:40.957049Z"
    },
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 0, Loss: 31093.8203, Test acc: 0.9713\n",
      "Epoch 10, Loss: 30741.4980, Test acc: 0.9662\n",
      "Epoch 20, Loss: 30226.7559, Test acc: 0.9454\n",
      "Epoch 30, Loss: 29428.2500, Test acc: 0.9001\n",
      "Epoch 40, Loss: 28276.7109, Test acc: 0.8340\n",
      "Epoch 50, Loss: 26877.5781, Test acc: 0.7652\n",
      "Epoch 60, Loss: 25443.9766, Test acc: 0.7036\n",
      "Epoch 70, Loss: 24132.9941, Test acc: 0.6510\n",
      "Epoch 80, Loss: 23008.9941, Test acc: 0.6092\n",
      "Epoch 90, Loss: 22066.2109, Test acc: 0.5754\n"
     ]
    }
   ],
   "source": [
    "for i in range(epoch):\n",
    "    features_concat = th.cat((features, features_adv), 0)\n",
    "    features_concat = feat_norm_func(features_concat)\n",
    "    graph_adv.ndata['features'] = features_concat\n",
    "    pred_adv = model(graph_adv, features_concat)\n",
    "    pred_loss_ce = - \\\n",
    "        F.nll_loss(pred_adv[:size_raw][test_mask], pred_raw_label).cpu()\n",
    "    pred_adv_prob = th.softmax(pred_adv[:size_raw][test_mask], 1).cpu()\n",
    "    pred_loss_cw = (pred_adv_prob[[np.arange(50000), pred_raw_label]] - pred_adv_prob[\n",
    "        [np.arange(50000), attack_label]]).sum()\n",
    "    pred_loss = pred_loss_cw + k * pred_loss_ce\n",
    "\n",
    "    optimizer.zero_grad()\n",
    "    pred_loss.backward(retain_graph=True)\n",
    "    optimizer.step()\n",
    "\n",
    "    with th.no_grad():\n",
    "        features_adv.clamp_(-feat_lim, feat_lim)\n",
    "\n",
    "    if i % 10 == 0:\n",
    "        print(\"Epoch {}, Loss: {:.4f}, Test acc: {:.4f}\".format(i, pred_loss,\n",
    "            utils.compute_acc(pred_adv[:size_raw][test_mask], pred_raw_label)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:45.936948Z",
     "start_time": "2020-09-10T07:04:45.919151Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Feature range [-2.00, 2.00]\n",
      "Acc on train: 0.5907\n",
      "Acc on val: 0.5843\n",
      "Acc on test(compared with raw predictions): 0.5508\n"
     ]
    }
   ],
   "source": [
    "print(\"Feature range [{:.2f}, {:.2f}]\".format(\n",
    "    features_adv.min(), features_adv.max()))\n",
    "print(\"Acc on train: {:.4f}\".format(utils.compute_acc(\n",
    "    pred_adv[:size_reduced][train_mask], labels[train_mask])))\n",
    "print(\"Acc on val: {:.4f}\".format(utils.compute_acc(\n",
    "    pred_adv[:size_reduced][val_mask], labels[val_mask])))\n",
    "print(\"Acc on test(compared with raw predictions): {:.4f}\".format(utils.compute_acc(\n",
    "    pred_adv[:size_raw][test_mask], pred_raw_label)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-09-10T07:04:45.941675Z",
     "start_time": "2020-09-10T07:04:45.937900Z"
    }
   },
   "outputs": [],
   "source": [
    "# save adversarial adjacent matrix and adversarial features\n",
    "with open(os.path.join(data_dir, \"adj_adv.pkl\"), \"wb\") as f:\n",
    "    pickle.dump(adj_adv, f)\n",
    "np.save(os.path.join(data_dir, \"features_adv.npy\"), features_adv.detach().cpu().numpy())"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
